@anivar anivar commented Sep 26, 2025

Fixes a critical issue where users are randomly logged out because TokenOrchestrator clears tokens for ANY non-network error, including transient AWS service issues.

Resolves #14534

Description

The handleErrors() method was clearing tokens for all errors except NetworkError, causing users to be logged out during temporary AWS service issues (500s, rate limits, throttling). This PR modifies the token clearing logic to only clear tokens for definitive authentication failures.

This is a minimal fix with ~15 lines of actual code changes that solves a critical production issue affecting many users.

Implementation Details

TokenOrchestrator Changes

  • Removed overly broad token clearing condition that cleared tokens for any non-network error
  • Added isAuthenticationError() helper method to identify definitive auth failures
  • Tokens are now only cleared for errors that require re-authentication
  • Preserves tokens for all transient/retryable errors with safe default behavior

Errors that clear tokens (auth failures):

  • NotAuthorizedException - Refresh token expired/invalid
  • TokenRevokedException - Token explicitly revoked
  • UserNotFoundException - User doesn't exist
  • PasswordResetRequiredException - Password reset required
  • UserNotConfirmedException - Account not confirmed

Errors that preserve tokens (transient):

  • InternalErrorException - AWS service errors (500s)
  • TooManyRequestsException - Rate limiting
  • ThrottlingException - Request throttling
  • ServiceUnavailable - Temporary outages
  • NetworkError - Network issues
  • Any other unknown errors (safe default)
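
The classification above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the PR's actual code: only the `isAuthenticationError()` name, the exception names, and the `startsWith` matching come from this PR. The `NamedError` type, the `AUTH_ERROR_NAMES` constant, and the `clearedBefore` function (a paraphrase of the old behavior as the description states it) are hypothetical.

```typescript
// Hypothetical minimal error shape; the real service error type is not shown in this PR.
interface NamedError {
  name?: string;
}

// Exception names listed in this PR as definitive authentication failures.
const AUTH_ERROR_NAMES: readonly string[] = [
  'NotAuthorizedException', // Refresh token expired/invalid
  'TokenRevokedException', // Token explicitly revoked
  'UserNotFoundException', // User doesn't exist
  'PasswordResetRequiredException', // Password reset required
  'UserNotConfirmedException', // Account not confirmed
];

// Paraphrase of the old behavior described above (assumption, not the original source):
// every error except NetworkError led to tokens being cleared.
function clearedBefore(err: NamedError): boolean {
  return err.name !== 'NetworkError';
}

// New behavior: clear tokens only for known auth failures.
// Unknown or unnamed errors fall through to false, so tokens are preserved by default.
function isAuthenticationError(err: NamedError): boolean {
  const name = err.name;
  if (typeof name !== 'string') {
    return false;
  }
  return AUTH_ERROR_NAMES.some(errorName => name.startsWith(errorName));
}

const throttled: NamedError = { name: 'ThrottlingException' };
console.log(clearedBefore(throttled)); // true (user was logged out)
console.log(isAuthenticationError(throttled)); // false (tokens preserved)
console.log(isAuthenticationError({ name: 'NotAuthorizedException' })); // true
```

Using an allowlist of auth failures rather than a denylist of transient errors means any error name not anticipated here keeps the user signed in, which matches the "safe default" described in the list.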

Testing

  • Added 8 comprehensive test cases covering all error scenarios
  • Tests verify tokens are cleared only for auth errors
  • Tests verify tokens are preserved for transient errors
  • All existing tests pass
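
The two properties these tests check can be illustrated with a small table-driven sketch. It is not the PR's test suite, whose framework and helpers are not shown here; `InMemoryTokenStore`, `handleErrorsSketch`, and `tokensSurviveError` are hypothetical stand-ins, and the sketch omits anything else the real `handleErrors()` may do.

```typescript
// Hypothetical in-memory stand-in for the real token store.
class InMemoryTokenStore {
  private tokens: string | undefined = 'refresh-token';

  clear(): void {
    this.tokens = undefined;
  }

  hasTokens(): boolean {
    return this.tokens !== undefined;
  }
}

const AUTH_FAILURES: readonly string[] = [
  'NotAuthorizedException',
  'TokenRevokedException',
  'UserNotFoundException',
  'PasswordResetRequiredException',
  'UserNotConfirmedException',
];

// Simplified stand-in for the token-clearing decision only.
function handleErrorsSketch(err: { name?: string }, store: InMemoryTokenStore): void {
  const name = err.name ?? '';
  if (AUTH_FAILURES.some(prefix => name.startsWith(prefix))) {
    store.clear();
  }
}

// Returns true when tokens are still present after the error is handled.
function tokensSurviveError(errorName: string): boolean {
  const store = new InMemoryTokenStore();
  handleErrorsSketch({ name: errorName }, store);
  return store.hasTokens();
}

// [error name, expected: tokens survive?]
const cases: Array<[string, boolean]> = [
  ['NotAuthorizedException', false],
  ['TokenRevokedException', false],
  ['InternalErrorException', true],
  ['TooManyRequestsException', true],
  ['ThrottlingException', true],
  ['ServiceUnavailable', true],
  ['NetworkError', true],
  ['SomeUnknownError', true],
];

const failures = cases
  .filter(([name, expected]) => tokensSurviveError(name) !== expected)
  .map(([name]) => name);

console.log(failures.length === 0 ? 'all cases pass' : `failed: ${failures.join(', ')}`);
```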

Related Issues

Based on issue analysis:

Impact

Users will no longer be randomly logged out during AWS service issues. This is a critical fix that maintains backward compatibility while solving a major production issue affecting many users.

As discussed in the issue comments with @ahmedhamouda78 and @soberm, this approach only clears tokens for known authentication failures rather than all non-network errors.

…learing

Fixes the critical issue where users are randomly logged out due to TokenOrchestrator clearing tokens for ANY non-network error, including transient AWS service issues.

Resolves aws-amplify#14534

The `handleErrors()` method was clearing tokens for all errors except NetworkError, causing users to be logged out during:
- AWS service errors (500s)
- Rate limiting (TooManyRequestsException)
- Temporary service unavailability
- Other transient errors that should be retried

Modified token clearing logic to only clear tokens for definitive authentication failures that require re-authentication. Transient errors now preserve tokens, allowing for retry without forcing users to log in again.

- Removed overly broad token clearing condition
- Added `isAuthenticationError()` helper to identify auth failures
- Only clears tokens for errors that definitively indicate invalid/expired tokens
- Preserves tokens for all transient/retryable errors

Errors that clear tokens (auth failures):

- `NotAuthorizedException` - Refresh token expired/invalid
- `TokenRevokedException` - Token explicitly revoked
- `UserNotFoundException` - User no longer exists
- `PasswordResetRequiredException` - Password reset required
- `UserNotConfirmedException` - Account not confirmed

Errors that preserve tokens (transient):

- `InternalErrorException` - AWS service errors
- `TooManyRequestsException` - Rate limiting
- `ThrottlingException` - Request throttling
- `ServiceUnavailable` - Temporary service issues
- `NetworkError` - Network connectivity issues
- Any other transient or unknown errors

- Added comprehensive test coverage for all error scenarios
- Tests verify correct token clearing behavior for each error type
- All existing tests pass

This fix ensures users remain logged in during temporary AWS service issues while still properly handling genuine authentication failures.
@alexkates
It's been 3 weeks... any progress on this?

'UserNotConfirmedException', // User account is not confirmed
];

return authErrorNames.some(errorName => err.name?.startsWith(errorName));

@osama-rizk osama-rizk Oct 30, 2025

For safety, can we use optional chaining on both err.name and the startsWith call here? The suggested change is: return authErrorNames.some(errorName => err?.name?.startsWith?.(errorName));
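
To make the difference between the two forms concrete, here is a small sketch comparing the expression from the diff with the suggested one. The function names and the `any` typing are assumptions for illustration. In the actual code `err` is narrowed before this point, so some of the inputs below may not be reachable there.

```typescript
const authErrorNames = ['NotAuthorizedException', 'UserNotConfirmedException'];

// Expression as it appears in the diff: guards only against a missing name.
function matchAsInDiff(err: any): boolean {
  return authErrorNames.some(errorName => err.name?.startsWith(errorName));
}

// Suggested expression: also guards against a missing err and a non-string name.
function matchWithExtraChaining(err: any): boolean {
  return authErrorNames.some(errorName => err?.name?.startsWith?.(errorName));
}

// Helper: reports whether calling fn raises a TypeError.
function throwsTypeError(fn: () => unknown): boolean {
  try {
    fn();
    return false;
  } catch (e) {
    return e instanceof TypeError;
  }
}

// Both forms agree on well-formed errors.
console.log(matchAsInDiff({ name: 'NotAuthorizedException' })); // true
console.log(matchWithExtraChaining({ name: 'NotAuthorizedException' })); // true
console.log(matchAsInDiff({})); // false

// They differ only when name is not a string or err itself is missing.
console.log(throwsTypeError(() => matchAsInDiff({ name: 42 }))); // true
console.log(matchWithExtraChaining({ name: 42 })); // false
console.log(throwsTypeError(() => matchAsInDiff(undefined))); // true
console.log(matchWithExtraChaining(undefined)); // false
```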

@alexkates alexkates Oct 30, 2025

How can we speed up adding this additional optional chain?

Also, the error has already been asserted as a service error on line 184, so err is defined here.

@osama-rizk osama-rizk added the Auth Related to Auth components/category label Oct 30, 2025
osama-rizk previously approved these changes Oct 31, 2025
@osama-rizk

Hey @alexkates,
Apologies for the delay. We are working to get this PR merged ASAP.

@alexkates

> Hey @alexkates, Apologies for the delay. We are working to get this PR merged ASAP.

TY Amplify team!

Labels

Auth Related to Auth components/category

Development

Successfully merging this pull request may close these issues.

[Critical Error] Users being randomly logged out in production

4 participants